Performance Measures for Multi-Graded Relevance

Authors

Christian Scheel

Andreas Lommatzsch

Sahin Albayrak

Abstract

We extend performance measures commonly used in semantic web applications to be capable of handling multi-graded relevance data. Most of today's recommender social web applications offer the possibility to rate objects with different levels of relevance. Nevertheless, most performance measures in Information Retrieval and recommender systems are based on the assumption that retrieved objects (e.g. entities or documents) are either relevant or irrelevant. Hence, thresholds have to be applied to convert multi-graded relevance labels to binary relevance labels. With regard to the necessity of evaluating information retrieval strategies on multi-graded data, we propose an extended version of the performance measure average precision that pays attention to levels of relevance without applying thresholds, but keeping and respecting the detailed relevance information. Furthermore, we propose an improvement to the NDCG measure avoiding problems caused by different scales in different datasets.
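
The thresholding step that the abstract describes can be illustrated with a minimal Python sketch. This is not the extended measure proposed in the paper; it only shows the conventional approach of binarizing graded labels before computing average precision. The function names, the 0 to 3 grade scale, and the example ranking are assumptions made for illustration, and the average is taken over the relevant items that appear in the ranked list.

```python
from typing import List, Sequence


def binarize(grades: Sequence[int], threshold: int) -> List[int]:
    """Map graded labels to 0/1: an item is relevant iff grade >= threshold."""
    return [1 if g >= threshold else 0 for g in grades]


def average_precision(binary_labels: Sequence[int]) -> float:
    """Binary average precision over a ranked list (best-ranked item first).

    Assumption: every relevant item appears in the list, so the sum of
    precision values is divided by the number of relevant items found.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(binary_labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


# Hypothetical ranking with grades on an assumed 0..3 scale.
ranked_grades = [1, 3, 0, 2]

# The same ranking scored under two different thresholds.
ap_strict = average_precision(binarize(ranked_grades, threshold=3))
ap_lenient = average_precision(binarize(ranked_grades, threshold=1))

print(ap_strict, ap_lenient)
```

For this ranking the strict threshold yields an average precision of 0.5 and the lenient one roughly 0.917, so the score depends on a threshold choice that discards the distinction between grades 1, 2, and 3.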

Similar resources

Cumulated Gain-based Indicators of IR Performance

Modern large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques to this direction, it is necessary to develop evaluation approaches and methods that credit IR methods for their abi...

Nugget-Based Computation of Graded Relevance

We propose a simple method for assigning graded relevance values to documents judged during the course of a retrieval experiment. In making this proposal, we aim to avoid the potential for ambiguity and greater cognitive load associated with standard graded relevance judgments. Under our proposal, we first decompose a retrieval topic into a number of informational nuggets. For each document, a ...

Survey of graded relevance metrics for information retrieval

A large number of metrics are available to evaluate the quality of ranked web pages in information retrieval (IR). These metrics can be classified into the following groups: Binary Relevance, Graded Relevance, Rank Correlation Coefficient, and User Oriented Measures. Each group of metrics has different characteristics. However, metrics in the same group have similar charac...

A Multi-objective simulated annealing algorithm to solving flexible no-wait flowshop scheduling problems with transportation times

This paper deals with a bi-objective hybrid no-wait flowshop scheduling problem minimizing the makespan and total weighted tardiness, in which we consider transportation times between stages. Obtaining an optimal solution for this type of complex, large-sized problem in reasonable computational time by using traditional approaches and optimization tools is extremely difficult. This paper presen...

Predicting Relevance based on Assessor Disagreement

We present the Predicted Relevance Model (PRM): it allows moving from binary evaluation measures that reflect a single assessor’s judgments, towards graded measures that represent the relevance towards random users.

Publication date: 2011
